Neural Likelihoods via Cumulative Distribution Functions
We leverage neural networks as universal approximators of monotonic functions
to build a parameterization of conditional cumulative distribution functions
(CDFs). By the application of automatic differentiation with respect to
response variables and then to parameters of this CDF representation, we are
able to build black box CDF and density estimators. A suite of families is
introduced as alternative constructions for the multivariate case. At one
extreme, the simplest construction is a competitive density estimator against
state-of-the-art deep learning methods, although it does not provide an easily
computable representation of multivariate CDFs. At the other extreme, we have a
flexible construction from which multivariate CDF evaluations and
marginalizations can be obtained by a simple forward pass in a deep neural net,
but where the computation of the likelihood scales exponentially with
dimensionality. Alternatives in between the extremes are discussed. We evaluate
the different representations empirically on a variety of tasks involving tail
area probabilities, tail dependence and (partial) density estimation.
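Not part of the abstract, but to make the construction concrete: below is a minimal
PyTorch sketch of a conditional CDF network that is monotone in a scalar response,
with the density recovered by differentiating the CDF with respect to that response.
The class name MonotonicCDF and all sizes are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as nnf

class MonotonicCDF(nn.Module):
    """Conditional CDF F(y | x), monotone in the scalar response y."""
    def __init__(self, x_dim, hidden=32):
        super().__init__()
        self.cond = nn.Linear(x_dim, hidden)   # conditioning on covariates x
        self.w1 = nn.Parameter(torch.randn(hidden, 1))
        self.w2 = nn.Parameter(torch.randn(1, hidden))
        self.b2 = nn.Parameter(torch.zeros(1))

    def forward(self, y, x):
        # softplus keeps the weights acting on y positive, so each layer is
        # non-decreasing in y; sigmoid constrains the output range to (0, 1)
        h = torch.tanh(y @ nnf.softplus(self.w1).T + self.cond(x))
        return torch.sigmoid(h @ nnf.softplus(self.w2).T + self.b2)

model = MonotonicCDF(x_dim=3)
x = torch.randn(8, 3)
y = torch.randn(8, 1, requires_grad=True)
F_y = model(y, x)                                   # CDF values in (0, 1)
pdf = torch.autograd.grad(F_y.sum(), y, create_graph=True)[0]  # f = dF/dy
loss = -torch.log(pdf + 1e-12).mean()               # maximum likelihood
loss.backward()
```

This sketch only constrains the output range; driving F towards 0 and 1 in the
tails, and the multivariate families the abstract describes, need extra structure.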
Cumulative Distribution Functions As The Foundation For Probabilistic Models
This thesis discusses applications of probabilistic and connectionist models for
constructing and training cumulative distribution functions (CDFs). First, it is shown
how existing tools from the copula literature can be combined to build probabilistic
models. It is found that this simple construction leads to numerical and scalability
issues that make training and inference challenging.
Next, several innovative ideas, combining neural networks, automatic differentiation
and copula functions, show how to assemble black-box probabilistic
models. The basic building block is a cumulative distribution function that is straightforward
to construct, composed of arithmetic operations and nonlinear functions.
There is no need to assume any specific parametric probability density function
(PDF), making the model flexible and normalisation unnecessary. The only requirement
is to design a computational graph that parameterises monotonically
non-decreasing functions with a constrained range. Training can then be performed
using standard tools from any neural network software library.
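A sketch of why the likelihood gets expensive in the multivariate case (a toy CDF
of my own, not a construction from the thesis): the density is the mixed partial
derivative of the CDF, so each response dimension nests one more pass of automatic
differentiation.

```python
import torch

def toy_cdf(y1, y2):
    # stand-in for a monotone network: non-decreasing in both arguments
    return torch.sigmoid(y1) * torch.sigmoid(y2)

y1 = torch.tensor(0.3, requires_grad=True)
y2 = torch.tensor(-0.7, requires_grad=True)
F = toy_cdf(y1, y2)
dF_dy1 = torch.autograd.grad(F, y1, create_graph=True)[0]
pdf = torch.autograd.grad(dF_dy1, y2)[0]   # f(y1, y2) = d2F / (dy1 dy2)
print(float(pdf))
```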
Finally, factorial hidden Markov models (FHMMs) for sequential data are
presented. It is shown how to leverage cumulative distribution functions, in the
form of the Gaussian copula, together with amortised stochastic variational inference
to encode hidden Markov chains coherently. This approach enables efficient learning
and inference for long sequences of high-dimensional data with long-range dependencies.
Tackling such complex problems was intractable with the established
FHMM approximate inference algorithm.
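For reference, the Gaussian copula itself is a standard object; a minimal SciPy
sketch of its CDF follows (the equicorrelation covariance is illustrative, and
this is only the copula, not the thesis's amortised variational encoder).

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def gaussian_copula_cdf(u, corr):
    """C(u) = Phi_R(Phi^{-1}(u_1), ..., Phi^{-1}(u_d)), correlation corr."""
    z = norm.ppf(u)                     # uniform marginals -> normal scores
    d = len(u)
    cov = corr * np.ones((d, d)) + (1.0 - corr) * np.eye(d)
    return multivariate_normal(mean=np.zeros(d), cov=cov).cdf(z)

print(gaussian_copula_cdf(np.array([0.9, 0.9]), corr=0.7))
```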
It is empirically verified on several problems that some of the estimators introduced
in this work perform comparably to or better than currently popular
models, especially on tasks requiring tail-area or marginal probabilities, which can be
read directly from a cumulative distribution function.
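To illustrate that last point (my example, with a known Gaussian standing in for
a learned CDF): tail areas and marginals are single CDF evaluations rather than
integrals over a density.

```python
import numpy as np
from scipy.stats import multivariate_normal

joint = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, 0.5], [0.5, 1.0]])

marginal_p = joint.cdf(np.array([1.0, 50.0]))  # ~ P(Y1 <= 1): send Y2 to +inf
tail_p = 1.0 - marginal_p                      # ~ P(Y1 > 1), the tail area
print(marginal_p, tail_p)
```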